Development of a Patient-Reported Outcomes Tool to Assess Pain and Discomfort in Autosomal Dominant Polycystic Kidney Disease


Contents
Literature Search Methodology
Supplemental Table 1. Literature search terms
Focus Group Methodology
Results of Quantitative Analyses
Supplemental Table 2. Item-level characteristics of the ADPKD-PDS (N=298)
Supplemental Table 3. Distributional and internal consistency statistics of ADPKD-PDS domains (N=298)
Supplemental Table 4. Correlations of the ADPKD-PDS with other instruments at baseline (N=298)

Literature Search Methodology
A literature search was conducted in 2010 to identify articles pertaining to health-related quality of life and other patient-reported outcomes (PROs) used in autosomal dominant polycystic kidney disease (ADPKD) studies. Medical Literature Analysis and Retrieval System Online (MEDLINE®) was used for the initial literature review. The search terms used are listed in Supplemental Table 1.
A supplementary literature search was conducted in September 2014 to confirm findings and included the term "ADPKD pain." The literature search was limited to papers published in 2004-2014.

Supplemental Table 1. Literature search terms
Columns: MeSH Terms; EMTREE Terms; Title/Abstract Terms
Disease
• Polycystic kidney diseases
HRQoL, health-related quality of life; MeSH, medical subject heading; PRO, patient-reported outcome.

Focus Group Methodology
Discussion guides were used to facilitate the focus groups. All study documents were translated into the appropriate languages by in-country specialized medical translators, all of whom had noted experience with renal disease translations. Focus groups in English-speaking locations were moderated by English speakers (the study research scientist and study investigator, respectively). Focus groups conducted in non-English-speaking locations used trained moderators to facilitate the discussions in native languages.
These focus group discussions were monitored by the study investigator using simultaneous translation in the adjoining back room. Study investigators observed all focus group interactions, including before, during, and after the formal meeting took place. When a translator was used, translation continued during these times so that such observations could be replicated at all sites.
Focus group recordings were transcribed verbatim by a professional vendor. For focus groups conducted in a non-English-speaking country, transcription was simultaneously conducted from the English interpretation audio file. During data transcription, participants were referred to by number only. If a participant was identified on the recording, his or her identity was changed to a number at the time of transcription. No data were linked to any particular participant.
Transcripts of focus group sessions were reviewed and codified for analyses. The data analyses conducted were primarily descriptive (e.g., means, standard deviations). Concepts were identified based on themes that were mentioned by 2 or more participants within a group. Saturation, the point at which no new concepts are identified, was considered to be achieved when no new concepts or themes were identified during subsequent focus groups or interviews. A saturation table was developed based on the participants' responses from the focus groups.
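The saturation rule described above can be illustrated with a minimal Python sketch. It is not the study's actual coding procedure: the function names, the data layout (participant number and concept pairs per group), and the example concepts are all hypothetical, chosen only to show how a "2 or more participants" threshold and a "no new concepts" check could be tabulated.

```python
from collections import Counter

def group_concepts(mentions, min_participants=2):
    """Return concepts raised by at least `min_participants` distinct participants.

    `mentions` is a list of (participant_number, concept) pairs for one group.
    """
    unique_mentions = set(mentions)  # count each participant once per concept
    counts = Counter(concept for _, concept in unique_mentions)
    return {c for c, n in counts.items() if n >= min_participants}

def saturation_table(groups):
    """For each group, in order, list the concepts not seen in earlier groups."""
    known = set()
    rows = []
    for name, mentions in groups:
        concepts = group_concepts(mentions)
        rows.append((name, sorted(concepts - known)))
        known |= concepts
    return rows

def saturation_reached(rows):
    """True when the most recent group contributed no new concepts."""
    return len(rows) > 1 and not rows[-1][1]

if __name__ == "__main__":
    # Hypothetical data: three groups, participants identified by number only.
    groups = [
        ("Group 1", [(1, "dull pain"), (2, "dull pain"), (3, "fullness")]),
        ("Group 2", [(1, "dull pain"), (2, "dull pain"),
                     (1, "fullness"), (2, "fullness")]),
        ("Group 3", [(1, "dull pain"), (2, "dull pain")]),
    ]
    rows = saturation_table(groups)
    for name, new in rows:
        print(name, new)
    print("Saturation reached:", saturation_reached(rows))
```

In this toy data, "fullness" does not count in Group 1 because only one participant raised it; it enters the table in Group 2, and Group 3 adds nothing new.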

Item response theory
The item response theory analysis examined the ability of each item to distinguish among persons with different levels of the trait. Item discrimination generally ranges from 0.00 to 5.00; a larger discrimination parameter indicates that an item can make finer distinctions between persons. [...] and is often handled by removal of the item from the assessment. 3,4 No uniform differential item functioning was observed for any item on the ADPKD-PDS, although nonuniform differential item functioning was observed for 4 items. Examination of residual regression plots and standardized beta-coefficients for these items revealed few obvious differences. For item 16, males tended to have a higher variation in the reported frequency of sharp pain ratings than females; this weakened the precision of the model, which was compounded by the smaller number of males (n=59) than females (n=239) in the study or possible differences in sample characteristics of male versus female participants. Nothing stood out as a reason to drop these items or to apply a different item scoring algorithm by gender.
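To make the role of the discrimination parameter concrete, the sketch below uses a two-parameter logistic (2PL) item response function. This is an assumption made for illustration: the text does not state which item response theory model was fitted, and rating-scale items such as these are more often analyzed with a polytomous model. The function names and parameter values are hypothetical.

```python
from math import exp

def two_pl_probability(theta, discrimination, difficulty):
    """Probability of endorsing an item under a 2PL model (illustrative only)."""
    return 1.0 / (1.0 + exp(-discrimination * (theta - difficulty)))

def separation(discrimination, low=-0.5, high=0.5, difficulty=0.0):
    """Difference in endorsement probability between two nearby trait levels."""
    return (two_pl_probability(high, discrimination, difficulty)
            - two_pl_probability(low, discrimination, difficulty))

if __name__ == "__main__":
    # Larger discrimination values separate nearby trait levels more sharply.
    for a in (0.5, 1.0, 2.0, 4.0):
        print(f"discrimination={a:.1f}  separation={separation(a):.3f}")
```

Under this simplified model, two respondents one trait unit apart are separated by a larger difference in endorsement probability as the discrimination parameter grows, which is the sense in which a more discriminating item makes finer distinctions.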

Item-level psychometric statistics
The adjusted item-total correlations (correlation between the item score and the domain score) ranged from 0.69 to 0.77 for the Overall Pain Severity items, 0.80 to 0.83 for Dull Pain Severity items, 0.73 to 0.81 for Sharp Pain Severity items, and 0.85 to 0.91 for Discomfort Severity items, all of which exceed the recommended minimum value of 0.40. 5 Moreover, each item had a higher correlation with its own scale than with other scales in the measure, and its relationship with other conceptually related scales was stronger than with less directly related scales. These observations reflect the strong structural characteristics of the Pain Severity and Pain Interference scales. Logistic regression-based item bias analysis showed no substantial difference between genders.
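A minimal sketch of an adjusted item-total correlation follows. It assumes the common definition in which the item is removed from the domain total before correlating, so that the item does not correlate with itself; the text does not spell out the adjustment used. The function names and the small response matrix are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def adjusted_item_total(responses, item):
    """Correlate one item with the sum of the remaining items in its domain.

    `responses` holds one row per respondent; each row lists that
    respondent's scores on the items of a single domain.
    """
    item_scores = [row[item] for row in responses]
    rest_scores = [sum(row) - row[item] for row in responses]
    return pearson(item_scores, rest_scores)

if __name__ == "__main__":
    # Hypothetical 5 respondents x 3 items on a 1-5 rating scale.
    responses = [[1, 1, 2], [2, 2, 2], [3, 3, 4], [4, 5, 4], [5, 4, 5]]
    for i in range(3):
        print(f"item {i}: r = {adjusted_item_total(responses, i):.2f}")
```

Each value can then be compared against the 0.40 minimum cited above.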

Domain-level descriptive statistics
Internal consistency of the domains was high, indicating good reliability (Supplemental Table 3).
Cronbach's alpha for each domain ranged from 0.87 to 0.95 and, as one would expect, was higher with a Spearman-Brown correction to a 10-item scale (0.93-0.98). Also as expected, average correlations between items within each Pain Severity scale were generally high (0.71-0.85). Overall Pain Severity was influenced by the heterogeneity of the pain items across the three pain domains and had the lowest inter-item correlation (r_ii = 0.57). Substantial floor effects (the percentage of participants reporting the lowest score) were observed for all domains (13.1%-70.1%). The highest floor effects were observed for Sharp Pain Severity and Sharp Pain Interference, reflecting the relative rarity of this symptom.
Minimal ceiling effects (percentage of participants reporting the highest score) for the domains were observed (0.3%-4.0%).
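The Spearman-Brown projection and the floor and ceiling percentages described above can be sketched as follows. This is an illustration under stated assumptions, not the study's analysis code: the function names are hypothetical, the floor and ceiling are taken at the lowest and highest observed scores, and the score list in the demo is invented.

```python
def spearman_brown(reliability, length_factor):
    """Project reliability to a scale `length_factor` times its current length."""
    return (length_factor * reliability) / (1.0 + (length_factor - 1.0) * reliability)

def standardized_alpha(mean_inter_item_r, n_items):
    """Standardized alpha for `n_items` items with a given mean inter-item
    correlation (Spearman-Brown applied to a single-item reliability)."""
    return spearman_brown(mean_inter_item_r, n_items)

def floor_ceiling(scores):
    """Percentage of respondents at the lowest and highest observed scores."""
    lowest, highest = min(scores), max(scores)
    n = len(scores)
    return 100.0 * scores.count(lowest) / n, 100.0 * scores.count(highest) / n

if __name__ == "__main__":
    # A mean inter-item correlation of 0.57 projected to 10 items.
    print(f"{standardized_alpha(0.57, 10):.3f}")
    # Hypothetical domain scores for five respondents.
    print(floor_ceiling([0, 0, 1, 2, 4]))
```

As a plausibility check on the arithmetic only, a mean inter-item correlation of 0.57 projects to roughly 0.93 on a 10-item scale, and 0.85 projects to roughly 0.98, which matches the ends of the corrected range reported above.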

Convergent validity
Correlations between the ADPKD-PDS Pain Severity scales and the BPI-SF Intensity scale (0.56-0.76) reflected strong convergence, as did correlations between the ADPKD-PDS Pain Interference scales and the BPI-SF Impact scale (0.59-0.84), likely because these scales have very similar measurement constructs (Supplemental Table 4). Correlations with the SF-12v2 were lower due to less similar, yet related constructs. The Pain Interference scales correlated more strongly with the SF-12v2 Physical Component Scale and Mental Component Scale (-0.37 to -0.67) than did the Pain Severity scales (-0.35 to -0.58). Correlations were also moderate to high between ADPKD-PDS scales and the ADPKD-IS Physical, Fatigue, and Emotional scales (0.37-0.76 for ADPKD-PDS Pain Severity scales and 0.40-0.86 for ADPKD-PDS Pain Interference scales). Taken together, these results provide evidence that both the ADPKD-PDS Pain Severity and Pain Interference scales possess good convergent validity characteristics.

Longitudinal characteristics
Test-retest reliability. Except for a single item in the Sharp Pain Interference scale, the test-retest correlations for all domains (0.73-0.90) exceeded the commonly accepted threshold of 0.70. 6 The Cohen's d effect sizes for changes in item scores were low (0.01 to 0.12) and similar to those for domain scores (0.01 to 0.11). The distribution of change scores and the small standardized mean differences in domain scores between the first and second administration of the ADPKD-PDS indicate little change in symptoms over time, consistent with the slow disease progression of ADPKD.
Responsiveness to change. Understanding what constitutes meaningful change helps clinicians make treatment decisions. Distribution-based responder analysis of changes in scores using a responder definition based on the standard error of measurement indicated that a change of 0.2-0.5 points (4%-10%) or more in either direction represents meaningful change for six of the seven ADPKD-PDS scales. An exception is the Sharp Pain Interference scale, which requires a change of a full point to be considered clinically meaningful (this is a single-item scale; a change of less than a full point cannot be estimated using a single item). Anchor-based responder analysis, in which other instruments are used as the reference for change, showed that five of the seven ADPKD-PDS scales were responsive to change in the pain-based global rating of change, six scales were responsive to change in the BPI-SF Pain Intensity scale, and all three Pain Interference scales were responsive to change in the BPI-SF Impact scale. This indicates that the ADPKD-PDS captured small but meaningful change over time.
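The distribution-based quantities used in this section can be sketched in a few lines of Python. The sketch rests on assumptions the text does not confirm: the standard error of measurement is computed with the usual formula SD × sqrt(1 − reliability), the responder threshold is set at one standard error of measurement, and Cohen's d for change is taken as the mean change divided by the standard deviation at the first administration. All function names and example values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability); the usual textbook formula."""
    return sd * sqrt(1.0 - reliability)

def cohens_d_change(first, second):
    """Mean change divided by the SD at the first administration
    (one of several conventions; assumed here for illustration)."""
    changes = [b - a for a, b in zip(first, second)]
    return mean(changes) / stdev(first)

def classify_change(change, sem, multiplier=1.0):
    """Label a score change relative to a threshold of `multiplier` SEMs."""
    threshold = multiplier * sem
    if change >= threshold:
        return "increased"
    if change <= -threshold:
        return "decreased"
    return "stable"

if __name__ == "__main__":
    # Hypothetical domain: SD of 1.0 and reliability of 0.84 give an SEM of 0.4.
    sem = standard_error_of_measurement(1.0, 0.84)
    print(f"SEM = {sem:.2f}")
    for change in (-0.6, -0.1, 0.5):
        print(change, classify_change(change, sem))
```

With these hypothetical inputs the threshold falls inside the 0.2-0.5 point range quoted above, but that agreement is a property of the chosen example values, not a reproduction of the study's results.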

Supplemental Table 2. Item-level characteristics of the ADPKD-PDS (N=298)
a Items 9 and 22 were removed from the Pain Interference scales during confirmatory factor analysis (CFA) modeling.
b No respondents endorsed the Extreme rating for their average dull pain.

Supplemental Table 3. Distributional and internal consistency statistics of ADPKD-PDS domains (N=298)
r_ii, inter-item correlation; SD, standard deviation.
a Percentage of patients with the lowest observed score in the range.
b Percentage of patients with the highest observed score in the range.
c Cronbach's alpha with a Spearman-Brown correction to a 10-item scale.